---
title: Getting Started with AWS SageMaker
sidebarTitle: AWS SageMaker
description: "Learn how to use TensorZero with AWS SageMaker LLMs: open-source gateway, observability, optimization, evaluations, and experimentation."
---

This guide shows how to set up a minimal deployment to use the TensorZero Gateway with the AWS SageMaker API.

The AWS SageMaker model provider is a wrapper around other TensorZero model providers that handles AWS SageMaker-specific logic (e.g. auth).
For example, you can use it to run inference against self-hosted model providers like Ollama and TGI deployed on AWS SageMaker.

## Setup

For this minimal setup, you'll need just two files in your project directory:

```
- config/
  - tensorzero.toml
- docker-compose.yml
```

<Tip>

You can also find the complete code for this example on [GitHub](https://github.com/tensorzero/tensorzero/tree/main/examples/guides/providers/aws-sagemaker).

</Tip>

For production deployments, see our [Deployment Guide](/deployment/tensorzero-gateway/).

You'll also need to deploy a SageMaker endpoint that serves your LLM.
For this example, we're using a container running Ollama.

### Configuration

Create a minimal configuration file that defines a model and a simple chat function:

```toml title="config/tensorzero.toml"
[models.gemma_3]
routing = ["aws_sagemaker"]

[models.gemma_3.providers.aws_sagemaker]
type = "aws_sagemaker"
model_name = "gemma3:1b"
endpoint_name = "my-sagemaker-endpoint"
region = "us-east-1"
# ... or use `allow_auto_detect_region = true` to infer region with the AWS SDK
hosted_provider = "openai"  # Ollama is OpenAI-compatible

[functions.my_function_name]
type = "chat"

[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "gemma_3"
```

The `hosted_provider` field specifies the model provider that you deployed on AWS SageMaker.
For example, Ollama is OpenAI-compatible, so we use `openai` as the hosted provider.
Alternatively, use `hosted_provider = "tgi"` if you deployed TGI instead.

You can specify the endpoint's `region` explicitly, or set `allow_auto_detect_region = true` to let the AWS SDK infer the region.

See the [Configuration Reference](/gateway/configuration-reference/) for optional fields.
The relevant fields will depend on the `hosted_provider`.

### Credentials

Make sure the gateway has the permissions it needs to access AWS SageMaker.
The TensorZero Gateway will use the AWS SDK to retrieve the relevant credentials.

The simplest way is to set the following environment variables before running the gateway:

```bash
AWS_ACCESS_KEY_ID=...
AWS_REGION=us-east-1
AWS_SECRET_ACCESS_KEY=...
```

Alternatively, you can use other authentication methods supported by the AWS SDK.
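
If you rely on the environment variables listed above, a small preflight check can catch a missing value before you start the gateway. The following is a minimal sketch, not part of TensorZero: the helper name `missing_aws_vars` is hypothetical, and the check only applies when you authenticate through these three variables rather than another method supported by the AWS SDK.

```python
import os

# Variable names taken from the environment variables listed in this guide.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_REGION", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(env=None):
    """Return the required AWS variables that are unset or empty in `env`.

    `env` defaults to the current process environment; pass a dict to
    check a different set of values (hypothetical helper for illustration).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required AWS environment variables are set.")
```

Treating empty values as missing is a deliberate choice here, since an empty credential is as unusable as an absent one.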

### Deployment (Docker Compose)

Create a minimal Docker Compose configuration:

```yaml title="docker-compose.yml"
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:?Environment variable AWS_ACCESS_KEY_ID must be set.}
      - AWS_REGION=${AWS_REGION:?Environment variable AWS_REGION must be set.}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:?Environment variable AWS_SECRET_ACCESS_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

You can start the gateway with `docker compose up`.

## Inference

Make an inference request to the gateway:

```bash
curl -X POST http://localhost:3000/inference \
  -H "Content-Type: application/json" \
  -d '{
    "function_name": "my_function_name",
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "What is the capital of Japan?"
        }
      ]
    }
  }'
```
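
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the `curl` command above rather than an official client: the helper names `build_inference_request` and `run_inference` are hypothetical, and the URL assumes the gateway is reachable on `localhost:3000` as published in `docker-compose.yml`.

```python
import json
import urllib.request

# Assumes the port mapping from docker-compose.yml ("3000:3000").
GATEWAY_URL = "http://localhost:3000/inference"


def build_inference_request(question, function_name="my_function_name", url=GATEWAY_URL):
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "function_name": function_name,
        "input": {"messages": [{"role": "user", "content": question}]},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_inference(question):
    """Send the request to the gateway and return the decoded JSON body."""
    request = build_inference_request(question)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With the gateway running, `run_inference("What is the capital of Japan?")` returns the parsed JSON response. Building the request separately from sending it makes the payload easy to inspect without a live endpoint.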
